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1. Introduction 

The field of biomedical signal processing has given rise to a number of techniques for 
assisting physicians with their everyday tasks of diagnosing and monitoring medical 
disorders. Analysis of the electrocardiogram (ECG) provides a quantitative description 
of the heart's electrical activity and is routinely used in hospitals as a tool for identifying 
cardiac disorders. 

A large variety of signal processing techniques have been employed for filtering the 
raw ECG signal prior to feature extraction and diagnosis of medical disorders. A typical 
ECG is invariably corrupted by (i) electrical interference from surrounding equipment 
{e.g. effect of the electrical mains supply), (ii) measurement (or electrode contact) noise, 
(iii) electromyographic (muscle contraction), (iv) movement artefacts, (v) baseline drift 
and respiratory artefacts and (vi) instrumentation noise (such as artefacts from the 
analogue to digital conversion process) (Friesen et al. 1990). 

Many techniques may be employed for filtering and removing noise from the raw 
ECG signal, such as wavelet decomposition (Nikolaev et al. 2000), Principal Component 
Analysis (PGA) (Paul et al. 2000), Independent Component Analysis (ICA) (Potter 
et al. 2002), nonlinear noise reduction (Schreiber & Kaplan 1996) and traditional 
Wiener methods. The ECG forms the basis of a wide range of medical studies, 
including the investigation of heart rate variability, respiration and QT dispersion (Malik 
& Camm 1995). The utility of these medical indicators relies on signal processing 
techniques for detecting R-peaks (Pan & Tompkins 1985), deriving heart rate and 
respiratory rate (Moody et al. 1985), and measuring QT-intervals (Davey 1999). 

Despite the numerous techniques that may be found in the literature and those 
that are now freely available on the Internet (Goldberger et al. 2000), it remains 
extremely difficult to evaluate and contrast their performance. The recent proliferation 
of biomedical databases, such as Physiobank (Goldberger et al. 2000), provides a 
common setting for comparing techniques and approaches. While this availability of 
real biomedical recordings has and will continue to advance the pace of research, the 
lack of internationally agreed upon benchmarks means that it is impossible to compare 
competing signal processing techniques. The definition of such benchmarks is hindered 
by the fact that the true underlying dynamics of a real ECG can never be known. This 
void in the field of biomedical research calls for a gold standard, where an ECG with 
well-understood dynamics and known characteristics is made freely available. 

The model presented here, known as EGGSYN (synthetic electrocardiogram), is 
motivated by the need to evaluate and quantify the performance of the above signal 
processing techniques on ECG signals with known characteristics. While the Physionet 
web-site (Goldberger et al. 2000) already contains a synthetic ECG generator (Ruha & 
Nissila 1997), this is not intended to be highly realistic. The model and its underlying 
algorithm described in detail in this paper is capable of producing extremely realistic 
ECG signals with complete fiexibility over the choice of parameters that govern the 
structure of these ECG signals in the temporal and spectral domains. In addition. 
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the average morphology of the ECG may be specified. In order to facihtate the use 
of ECGSYN, software has been made freely available as both Matlab and C code %. 
Furthermore, users can employ ECGSYN over the Internet using a Java applet, which 
provides a means of downloading an ECG signal with characteristics selected from a 
graphical user interface. 

2. Background 

The average heart rate is calculated by first measuring the time interval, denoted RR 
interval, between two consecutive R peaks (Fig. 1), taking the average reciprocal of 
this value over a fixed window (usually 15, 30 or 60 seconds) and then scaling to units 
of beats per minute (bpm). A time series of RR intervals is often referred to as an 
RR tachogram and the variation in this time series is governed by the balance between 
the sympathetic {fight and flight) and parasympathetic {rest and digest) branches of the 
central nervous system, known as the sympathovagal balance. In general, innervation of 
the fast acting parasympathetic branch decreases heart rate, whereas the (more slowly 
acting) sympathetic branch increases heart rate. This RR tachogram can therefore 
be used to estimate the effect of both these branches. A spectral analysis of the RR 
tachogram is usually divided into main frequency bands, known as the low-frequency 
(LF) band (0.04 to 0.15 Hz) and high-frequency (HE) band (0.15 to 0.4 Hz) (Task 
Force of the European Society of Cardiology et al. 1996). Sympathetic tone is believed 
to affect the LF component whereas both sympathetic and parasympathetic activity 
influence the HE component (Malik & Camm 1995). The ratio of the power contained 
in the LF and HE components has been used as a measure of the sympathovagal balance 
(Malik & Camm 1995, Task Force of the European Society of Cardiology et al. 1996). 
The structure of the power spectrum of the RR tachogram tends to vary from 
person to person with a number of spectral peaks associated with particular biological 
mechanisms (McSharry et al. 2002, Stefanovska et al. 2001). While the correspondence 
between these mechanisms and the positions of spectral peaks are strongly debated, 
there are two peaks which usually appear in most subjects. These are due to Respiratory 
Sinus Arrhythmia (RSA) (Hales 1733, Ludwig 1847) caused by parasympathetic activity 
which is synchronous with the respiratory cycle and Mayer waves caused by oscillations 
in the blood pressure waves (De Boer et al. 1987). RSA usually gives rise to a peak 
in the HE region around 0.25 Hz, corresponding to 15 breaths per minute, whereas the 
Mayer waves cause a peak around 0.1 Hz. 

3. Method 

The dynamical model, ECGSYN, employed for generating the synthetic ECG is 
composed of two parts. Firstly, an internal time series with internal sampling frequency 
/int is produced to incorporate a speciflc mean heart rate, standard deviation and 

I www.physionet.org/physiotools/ecgsyn 
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Figure 1. Two seconds of synthetic ECG reflecting the electrical activity in the heart 
during two beats. Morphology is shown by five extrema P,Q,R,S and T. Time intervals 
corresponding to the RR interval and the QT interval are also indicated. 



spectral characteristics corresponding to a real RR tachogram. Secondly, the average 
morphology of the ECG is produced by specifying the locations and heights of the 
peaks that occur during each heart beat. A flow chart of the processes in ECGSYN for 
producing the ECG is shown in Fig. 2. 

The spectral characteristics of the RR tachogram, including both RSA and Mayer 
waves, are replicated by specifying a bi-modal spectrum composed of the sum of two 
Gaussian functions. 
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with means /i,/2 and standard deviations Ci,C2. Power in the LF and HF bands are 
given by af and a| respectively whereas the variance equals the total area a^ = a^ + cr|, 
yielding an LF/HF ratio of crl/al- 

A time series T{t) with power spectrum S{f) is generated by taking the inverse 



Fourier transform of a sequence of complex numbers with amplitudes J S{f) and phases 
which are randomly distributed between and 27r. By multiplying this time series by 
an appropriate scaling constant and adding an offset value, the resulting time series 
can be given any required mean and standard deviation. Different realisations of the 
random phases may be specified by varying the seed of the random number generator. 
In this way, many different time series T(t) may be generated with the same temporal 
and spectral properties. 

The ECG traces a quasi-periodic waveform with each beat of the heart, with the 
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Figure 2. ECGSYN flow chart describing tlie procedure for specifying the temporal 
and spectral description of the RR tachogram and ECG morphology. 



morphology of each cycle labeled by its peaks and troughs, P, Q, R, S and T, as shown 
in Fig. 1. This quasi-periodicity can be reproduced by constructing a dynamical model 
containing an attracting limit cycle; each heart beat corresponds to one revolution 
around this limit cycle, which lies in the (x, |/)-plane as shown in Fig. 3. The morphology 
of the ECG is created by using a series of exponentials to force the trajectory to trace 
out the PQRST-waveform in the z-direction. A series of five angles, (^p, 9q^ ^r, 9s-, 
9t), are used to specify the extrema of the peaks (P,Q,R,S,T) respectively. 

The dynamical equations of motion are given by three ordinary differential 
equations (McSharry et al. 2003), 

X = ax — uy, 
y = ay + ux, 
z= - J2 a.A^,exp(-A^,V262) - (^ - zo), (2) 

ie{P,Q,R,S,T} 

where a = 1 — \Jx^ + y^, A^j = {9 — 9i) mod 27r, 9 = atan2(|/,a;) and uj is the angular 
velocity of the trajectory as it moves around the limit cycle. The coefficients a^ govern 
the magnitude of the peaks whereas the bi define the width (time duration) of each 
peak. Baseline wander may be introduced by coupling the baseline value zq in (2) to the 
respiratory frequency /2 in (1) using zo{t) = A sm{27i f2t) . The output synthetic ECG 
signal, s{t), is the vertical component of the three-dimensional dynamical system in (2): 
s{t)=z{t). 
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Figure 3. Three-dimensional state space of the dynamical system given by (2) showing 
motion around the limit cycle in the horizontal (x, y)-plane. The vertical z-component 
provides the synthetic ECG signal with a morphology that is defined by the five 
extrema P,Q,R,S and T. 



Having calculated the internal RR tachogram expressed by the time series T(t) 
with power spectrum S{f) given by (1), this can then be used to drive the dynamical 
model (2) so that the resulting RR intervals will have the same power spectrum as 
that given by S{f). Starting from the auxiliary§ time t„, with angle 6 = 9r, the time 
interval T(t„) is used to calculate an angular frequency Qn = wtT' '^^^^ particular 
angular frequency, Qn, is used to specify the dynamics until the angle 9 reaches Or 
again, whereby a complete revolution (one heart beat) has taken place. For the next 
revolution, the time is updated, t„+i = t„ + T(t„), and the next angular frequency, 
, is used to drive the trajectory around the limit cycle. In this way, the 
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internally generated beat-to-beat time series, T(t), can be used to generate an ECG 
with associated RR intervals that have the same spectral characteristics. The angular 
frequency uj{t) in (2) is specified using the beat-to-beat values f2„ obtained from the 
internally generated RR tachogram: 

Uj{t) = Qn, tn <t < tn+l- (3) 

Given these beat-to-beat values of the angular frequency a;, the equations of motion 
in (2) are integrated using a fourth-order Runge-Kutta method (Press et al. 1992). The 
time series T{t) used for defining the values of Vtn has a high sampling frequency of 
/int, which is effectively the step size of the integration. The final output ECG signal 
is then down-sampled to /ecg if /int > /ccg by a factor j^ to generate an ECG at the 
requested sampling frequency. For simplicity, /int is taken as an integer multiple of /ecg 
and anti-aliasing filtering is therefore not required if /ecg is chosen to be sufficiently high. 

§ This auxiliary time axis is used to calculate the values of f7„ for consecutive RR intervals whereas 
the time axis for the ECG signal is sampled around the limit cycle in the [x, y)-plane. 
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Figure 4. Synthetic ECG signals for different mean heart rates: (a) 30 bpm, (b) 60 
bpm and (c) 120 bpm. 

Table 1. Morphological parameters of the ECG model with modulation factor 

a = \/ft.moan/60. 



Index (i) 



R 



T 



Time (sees) -0.2 -0.05 0.05 0.3 

di (radians) ~^TTy/a — j^ttq: t^'^'Q^ 5'"' 

a^ 1.2 -5.0 30.0 -7.5 0.75 

b, 0.25a 0.1a 0.1a 0.1a 0.4a 



The size of the mean heart rate affects the shape of the ECG morphology. An 
analysis of real ECG signals for different heart rates shows that the intervals between 
the extrema vary by different amounts; in particular the QRS width decreases with 
increasing heart rate. This is as one would expect; when sympathetic tone increases the 
conduction velocity across the ventricles increases, together with an augmented heart 
rate. The time for ventricular depolarisation (represented by the QRS complex of the 
ECG) is therefore shorter. These changes are replicated by modifying the width of the 
exponentials in (2) and also the positions of the angles 6. This is achieved by using a 
heart rate dependent factor a = \//imcan/60 where /imcan is the mean heart rate expressed 
in units of bpm (see Table 1). 

Operation of ECGSYN, composed of the spectral characteristics given by (1) and 
the time domain dynamics in (2), requires the selection of the list of parameters given 
in Tables 1 and 2. 
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Table 2. Temporal and spectral parameters of the ECG model 

Description Notation Default values 

Approximate number of heart beats 

ECG sampling frequency 

Internal sampling frequency 

Amplitude of additive uniform noise 

Heart rate mean 

Heart rate standard deviation. 

Low frequency 

High frequency 

Low frequency standard deviation 

High frequency standard deviation 

LF/HF ratio 

4. Results 

The synthetic ECG provides a reahstic signal for a range of heart rates. Figure 4 
illustrates examples of the synthetic ECG for three different heart rates; 30 bpm, 60 bpm, 
and 120 bpm. Notice that the PR, QT and QRS widths all shorten with increasing heart 
rate. It is important to note that the nonlinear relationship between the morphology 
modulation factor a and mean heart rate /imcan limits the contraction of the overall 
PQRST morphology relative to the refractory period (the minimum amount of time in 
which depolarisation and repolarisation of the cardiac muscle can occur). 

The ability of ECGSYN to generate ECG signals with known spectral 
characteristics provides a means of testing the effect of varying the ECG sampling 
frequency /ecg on the estimation of heart rate variability (HRV) metrics. Figure 5 
illustrates the increase in estimation accuracy of a HRV metric, the LF/HF ratio, with 
increasing /ccg. The error bars represent one standard deviation on either side of the 
means (dots) of each 1000 Montecarlo runs. The true input LF/HF ratio was 0.5 as 
shown by the horizontal line. The synthetic ECG signals had a mean heart rate of 
60 bpm and a standard deviation of 3 bpm. The method used for estimating the 
LF/HF ratio, the Lomb periodogram, introduces negligible variance into the estimate 
(Clifford 1998), and therefore the downward bias of the estimates can be considered 
due to /ccg being too low. Note that below 512 Hz, the LF/HF ratio is considerably 
underestimated. This is consistent with studies performed on real data (Abboud & 
Barnea 1995). 

5. Discussion 

A dynamical model known as ECGSYN has been presented that generates realistic 
synthetic ECG signals. The user can specify both the temporal and spectral 
characteristics of the ECG. In addition, the average morphology of the ECG may be 
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Figure 5. LF/HF ratio estimates computed from synthetic ECG signals for a range 
of sampling frequencies using an input LF/HF ratio of 0.5 (horizontal line). The 
distribution of estimates is shown by the mean (dot) and plus/minus one standard 
deviation error bars. The simulations used 100 realisations of noise-free synthetic 
ECG signals with a mean heart rate of 60 bpm and standard deviation of 3 bpm. 



input into the algorithm. Open-source software for the algorithm underlying ECGSYN 
is freely available in both Matlab and C A Java applet facilitates the generation of ECG 
signals over the Internet with characteristics selected using a graphical user interface. 

By examining the statistical properties of artificially generated ECG signals, it has 
been shown that estimates of HRV using the LF/HF ratio depend on the sampling 
frequency, /ecg, of the ECG. Small values of /ccg gives rise to ECG signals which lead 
to underestimated LF/HF ratios. This provides a basis for the low sample frequency 
problem in HRV studies (Abboud & Barnea 1995). In addition, these results provide 
a guide for physicians when selecting the sampling frequency of the ECG based on the 
required accuracy of the HRV metrics. 

The availability of ECGSYN through open-source software and the ability to 
generate collections of ECG signals with carefully controlled and a priori known 
characteristics will allow biomedical researchers to test and provide operation statistics 
for new signal processing techniques. This will enable physicians to compare and 
evaluate different techniques and to select those that best suit their requirements. 
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